153 research outputs found

    Efficient pebbling for list traversal synopses

    We show how to support efficient back traversal in a unidirectional list, using small memory and with essentially no slowdown in forward steps. Using O(log n) memory for a list of size n, the i'th back-step from the farthest point reached so far takes O(log i) time in the worst case, while the overhead per forward step is at most ε for an arbitrarily small constant ε > 0. An arbitrary sequence of forward and back steps is allowed. A full trade-off between memory usage and time per back-step is presented: k vs. k·n^{1/k} and vice versa. Our algorithms are based on a novel pebbling technique which moves pebbles on a virtual binary, or t-ary, tree that can only be traversed in a pre-order fashion. The compact data structures used by the pebbling algorithms, called list traversal synopses, extend to general directed graphs, and have other interesting applications, including memory-efficient hash-chain implementation. Perhaps the most surprising application is in showing that for any program, arbitrary rollback steps can be efficiently supported with small overhead in memory, and marginal overhead in its ordinary execution. More concretely: let P be a program that runs for at most T steps, using memory of size M. Then, at the cost of recording the input used by the program, and increasing the memory by a factor of O(log T) to O(M log T), the program P can be extended to support an arbitrary sequence of forward execution and rollback steps: the i'th rollback step takes O(log i) time in the worst case, while forward steps take O(1) time in the worst case, and 1+ε amortized time per step. Comment: 27 pages
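
    The flavor of the result can be conveyed by a simplified checkpointing scheme (an illustrative sketch, not the paper's pebbling algorithm): for each level j, remember the node at the last visited position divisible by 2^j, using O(log n) pointers, and serve a back-step by replaying forward from the nearest retained checkpoint. The paper's pebble-moving rules are what sharpen this to O(log i) worst-case back-steps; the class and method names below are illustrative.

```python
class Node:
    """A node of a unidirectional (singly linked) list: forward-only."""
    def __init__(self, value, nxt=None):
        self.value = value
        self.next = nxt

class BackTraverser:
    """Simplified list traversal synopsis: for each level j, keep the node
    at the last visited position divisible by 2**j (O(log n) pointers).
    Back-steps replay forward from the nearest retained checkpoint."""
    def __init__(self, head):
        self.anchor = (0, head)   # permanent checkpoint at the list head
        self.ckpt = {}            # level j -> (position, node)
        self.pos, self.cur = 0, head

    def forward(self):
        self.cur = self.cur.next
        self.pos += 1
        j = 0
        while self.pos % (1 << j) == 0:   # refresh every level dividing pos
            self.ckpt[j] = (self.pos, self.cur)
            j += 1

    def back(self):
        """Step back one position; assumes pos > 0."""
        target = self.pos - 1
        # nearest checkpoint at or before the target (the head always qualifies)
        q, node = max((c for c in list(self.ckpt.values()) + [self.anchor]
                       if c[0] <= target), key=lambda c: c[0])
        while q < target:                 # replay the short forward segment
            node, q = node.next, q + 1
        self.pos, self.cur = target, node
        return node

# Example: walk 10 steps forward, then back-step without any back-pointers.
head = None
for v in reversed(range(11)):
    head = Node(v, head)
t = BackTraverser(head)
for _ in range(10):
    t.forward()
assert t.cur.value == 10 and t.back().value == 9 and t.back().value == 8
```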

    Efficient Bundle Sorting

    AMS subject classification: 68W01. DOI: 10.1137/S0097539704446554. Many data sets to be sorted consist of a limited number of distinct keys. Sorting such data sets can be thought of as bundling together identical keys and having the bundles placed in order; we therefore denote this as bundle sorting. We describe an efficient algorithm for bundle sorting in external memory, which requires at most c(N/B) log_{M/B} k disk accesses, where N is the number of keys, M is the size of internal memory, k is the number of distinct keys, B is the transfer block size, and 2 < c < 4. For moderately sized k, this bound circumvents the Θ((N/B) log_{M/B}(N/B)) I/O lower bound known for general sorting. We show that our algorithm is optimal by proving a matching lower bound for bundle sorting. The improved running time of bundle sorting over general sorting can be significant in practice, as demonstrated by experimentation. An important feature of the new algorithm is that it is executed "in-place," requiring no additional disk space.
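
    The base case of the idea can be sketched as follows (an illustrative simplification, using a Python list as a stand-in for the disk; the function name is hypothetical): when the k distinct keys and their counts fit in internal memory, one counting pass plus one output pass suffice, about 2N/B block transfers independent of log(N/B). The full algorithm handles larger k by recursively partitioning the key range into M/B sub-ranges, in place.

```python
from collections import Counter

def bundle_sort(data, block_size):
    """Sketch of bundle sorting's base case: count each distinct key in one
    block-by-block pass, then emit the bundles in key order. For small k
    this takes ~2N/B block transfers, beating the general-sorting bound of
    Theta((N/B) log_{M/B}(N/B)) I/Os."""
    counts = Counter()
    for i in range(0, len(data), block_size):   # read pass, one block at a time
        counts.update(data[i : i + block_size])
    out = []
    for key in sorted(counts):                  # emit each bundle in key order
        out.extend([key] * counts[key])
    return out

assert bundle_sort([3, 1, 3, 2, 1, 3], block_size=2) == [1, 1, 2, 3, 3, 3]
```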

    Approximate Data Structures with Applications

    In this paper we introduce the notion of approximate data structures, in which a small amount of error is tolerated in the output. Approximate data structures trade error of approximation for faster operation, leading to theoretical and practical speedups for a wide variety of algorithms. We give approximate variants of the van Emde Boas data structure, which support the same dynamic operations as the standard van Emde Boas data structure [28, 20], except that answers to queries are approximate. The variants support all operations in constant time provided the error of approximation is 1/polylog(n), and in O(log log n) time provided the error is 1/polynomial(n), for n elements in the data structure. We consider the tolerance of prototypical algorithms to approximate data structures. We study in particular Prim's minimum spanning tree algorithm, Dijkstra's single-source shortest paths algorithm, and an on-line variant of Graham's convex hull algorithm. To obtain output which approximates the desired output with the error of approximation tending to zero, Prim's algorithm requires only linear time, Dijkstra's algorithm requires O(m log log n) time, and the on-line variant of Graham's algorithm requires constant amortized time per operation.
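
    The quantization idea behind such structures can be sketched as follows (a simplified illustration, not the paper's construction; the class name is hypothetical): bucket each positive value by floor(log(x)/log(1+ε)), so that predecessor queries are answered up to a 1+ε multiplicative error. A sorted list stands in below for the van Emde Boas structure one would keep on the reduced universe of bucket indices.

```python
import math
from bisect import bisect_right, insort

class ApproxPredecessor:
    """Sketch: store only the bucket index floor(log(x)/log(1+eps)) of each
    positive value, so queries are correct up to a 1+eps multiplicative
    factor. A sorted list of distinct bucket indices stands in here for a
    van Emde Boas structure on the reduced universe."""
    def __init__(self, eps):
        self.scale = math.log1p(eps)
        self.buckets = []                       # sorted, distinct bucket indices

    def _bucket(self, x):
        return math.floor(math.log(x) / self.scale)

    def insert(self, x):
        b = self._bucket(x)
        i = bisect_right(self.buckets, b)
        if i == 0 or self.buckets[i - 1] != b:  # deduplicate bucket indices
            insort(self.buckets, b)

    def approx_predecessor(self, x):
        """A value within a 1+eps factor of the largest stored value <= x."""
        i = bisect_right(self.buckets, self._bucket(x))
        return math.exp(self.buckets[i - 1] * self.scale) if i else None

s = ApproxPredecessor(eps=0.01)
for v in (5.0, 40.0, 1000.0):
    s.insert(v)
print(s.approx_predecessor(100.0))   # ~40, up to 1% multiplicative error
```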

    Dynamic Generation of Discrete Random Variates

    The original publication is available at www.springerlink.com. We present and analyze efficient new algorithms for generating a random variate distributed according to a dynamically changing set of N weights. The base version of each algorithm generates the discrete random variate in O(log* N) expected time and updates a weight in O(2^{log* N}) expected time in the worst case. We then show how to reduce the update time to O(log* N) amortized expected time. We finally show how to apply our techniques to a lookup-table technique in order to obtain expected constant time in the worst case for generation and update. We give parallel algorithms for parallel generation and update having optimal processor-time product. Besides the usual application in computer simulation, our method can be used to perform constant-time prediction in prefetching applications. We also apply our techniques to obtain an efficient dynamic algorithm for maintaining an approximate heap of N elements, in which each query is required to return an element whose value is within an ε multiplicative factor of the maximal element value. For ε = 1/polylog(N), each query, insertion, or deletion takes O(log log log N) time.
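
    A folklore scheme conveys the flavor of dynamic weighted sampling (a simplified sketch, not the paper's algorithm and without its O(log* N) bounds; the class name is hypothetical): bucket items by floor(log2(weight)), choose a bucket proportionally to its total weight by scanning the few nonempty buckets, then pick an item inside the bucket by rejection, which accepts with probability at least 1/2 per trial because all weights in a bucket lie within a factor of 2 of each other.

```python
import math
import random
from collections import defaultdict

class DynamicSampler:
    """Sketch: items bucketed by floor(log2(weight)); sampling scans the
    nonempty buckets by total weight, then rejection-samples inside the
    chosen bucket (acceptance probability >= 1/2 per trial)."""
    def __init__(self):
        self.buckets = defaultdict(dict)   # level -> {item: weight}
        self.totals = defaultdict(float)   # level -> sum of weights
        self.total = 0.0

    def insert(self, item, weight):        # assumes weight > 0
        lvl = math.floor(math.log2(weight))
        self.buckets[lvl][item] = weight
        self.totals[lvl] += weight
        self.total += weight

    def delete(self, item, weight):        # caller supplies the item's weight
        lvl = math.floor(math.log2(weight))
        del self.buckets[lvl][item]
        self.totals[lvl] -= weight
        self.total -= weight
        if not self.buckets[lvl]:          # drop empty buckets
            del self.buckets[lvl], self.totals[lvl]

    def sample(self):
        r = random.random() * self.total
        for lvl, tot in self.totals.items():      # pick bucket by total weight
            if r < tot:
                break
            r -= tot
        bucket = self.buckets[lvl]
        while True:                                # rejection within the bucket
            item = random.choice(list(bucket))     # O(1) with array-backed buckets
            if random.random() * 2 ** (lvl + 1) < bucket[item]:
                return item

s = DynamicSampler()
s.insert("a", 1.0); s.insert("b", 8.0)
print(s.sample())   # "b" with probability 8/9
```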

    Accounting for Memory Bank Contention and Delay in High-Bandwidth Multiprocessors

    This paper considers issues of memory performance in shared memory multiprocessors that provide a high-bandwidth network and in which the memory banks are slower than the processors. We are concerned with the effects of memory bank contention, memory bank delay, and the bank expansion factor (the ratio of the number of banks to the number of processors) on performance, particularly for irregular memory access patterns. This work was motivated by observed discrepancies between predicted and actual performance in a number of irregular algorithms implemented for the Cray C90 when the memory contention at a particular location is high. We develop a formal framework for studying memory bank contention and delay, and show several results, both experimental and theoretical. We first show experimentally that our framework is a good predictor of performance on the Cray C90 and J90, providing a good accounting of bank contention and delay. Second, we show that it often improves performance to have addi..
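
    The kind of accounting involved can be illustrated with a toy simulator (an assumed simple model, not the paper's formal framework; the function name and parameters are hypothetical): addresses are interleaved across banks, each access occupies its bank for a fixed delay, and an access stalls until its bank is free. Varying num_banks relative to the number of issued accesses per cycle models the bank expansion factor.

```python
def bank_completion_time(trace, num_banks, bank_delay):
    """Toy model: accesses are issued one per cycle in trace order; an
    access to bank (addr % num_banks) starts at max(its issue cycle, the
    cycle its bank frees up) and holds the bank for bank_delay cycles.
    Returns the cycle at which the last access completes."""
    free_at = [0] * num_banks                  # cycle when each bank frees up
    finish = 0
    for issue, addr in enumerate(trace):
        bank = addr % num_banks                # interleaved address mapping
        start = max(issue, free_at[bank])
        free_at[bank] = start + bank_delay
        finish = max(finish, free_at[bank])
    return finish

# Contention-free stride-1 trace vs. worst case: every access hits one bank.
print(bank_completion_time(range(1024), num_banks=64, bank_delay=8))   # ~1031
print(bank_completion_time([0] * 1024, num_banks=64, bank_delay=8))    # 8192
```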